Specialize PartialOrd<A> for [A] where A: Ord
#39642
This way we can call `cmp` instead of `partial_cmp` in the loop, removing some burden of optimizing `Option`s away from the compiler.

PR #39538 introduced a regression where sorting slices suddenly became slower, since `slice1.lt(slice2)` was much slower than `slice1.cmp(slice2) == Less`. This problem is now fixed.

To verify, I benchmarked this simple program:

```rust
fn main() {
    let mut v = (0..2_000_000)
        .map(|x| x * x * x * 18913515181)
        .map(|x| vec![x, x ^ 3137831591])
        .collect::<Vec<_>>();
    v.sort();
}
```

Before this PR, it would take 0.95 sec, and now it takes 0.58 sec.

I also tried changing the `is_less` lambda to use `cmp` and `partial_cmp`. Now all three versions (`lt`, `cmp`, `partial_cmp`) are equally performant for sorting slices: all of them take 0.58 sec on the benchmark.
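For readers unfamiliar with the difference, here is a minimal sketch of the two comparison loops. The free functions `compare_partial` and `compare_total` are hypothetical names used only for illustration; they are not the actual code in `src/libcore/slice.rs`, which expresses this through specialization.

```rust
use std::cmp::Ordering;

// Hypothetical helper: lexicographic comparison through `partial_cmp`.
// Every element comparison yields an `Option<Ordering>` that has to be
// matched on, even when the element type is totally ordered.
fn compare_partial<A: PartialOrd>(a: &[A], b: &[A]) -> Option<Ordering> {
    for (x, y) in a.iter().zip(b) {
        match x.partial_cmp(y) {
            Some(Ordering::Equal) => (),
            non_eq => return non_eq,
        }
    }
    // All shared elements are equal: the shorter slice sorts first.
    a.len().partial_cmp(&b.len())
}

// Hypothetical helper: the same loop for `A: Ord`, calling `cmp` directly
// so no `Option` appears inside the loop.
fn compare_total<A: Ord>(a: &[A], b: &[A]) -> Ordering {
    for (x, y) in a.iter().zip(b) {
        match x.cmp(y) {
            Ordering::Equal => (),
            non_eq => return non_eq,
        }
    }
    a.len().cmp(&b.len())
}
```

Both functions agree whenever every element comparison is defined; only `compare_partial` can return `None`, for example when a `NaN` is encountered.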
src/libcore/slice.rs
Outdated
        self.len().partial_cmp(&other.len())
    }
}

impl SlicePartialOrd<u8> for [u8] {
Could this impl not be extended to all `A: Ord`, so it would simply use `Some(SliceOrd::compare(self, other))`?
That's a good suggestion, thanks! I've updated the code.
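A rough sketch of the shape this suggestion leads to, written for stable Rust: the trait `PartialCompareSketch` is a hypothetical stand-in for the internal `SlicePartialOrd`, and the real code additionally relies on specialization to keep a fallback impl for `A: PartialOrd`, which is omitted here.

```rust
use std::cmp::Ordering;

// Hypothetical stand-in for the internal `SlicePartialOrd` trait.
trait PartialCompareSketch {
    fn partial_compare(&self, other: &Self) -> Option<Ordering>;
}

// For any totally ordered element type, the partial comparison is just the
// total comparison wrapped in `Some`.
impl<A: Ord> PartialCompareSketch for [A] {
    fn partial_compare(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
```

With an impl of this form covering every `A: Ord`, a separate impl for `[u8]` has nothing left to add, since `u8: Ord`.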
src/libcore/slice.rs
Outdated
        Some(SliceOrd::compare(self, other))
    }
}

impl SlicePartialOrd<u8> for [u8] {
I think this specialization could be removed now, right?
Yeah, removed.
@bors: r+
📌 Commit a344c12 has been approved by |
☀️ Test successful - status-appveyor, status-travis
Tangentially, as soon as we get `default impl`, it might be a good idea to implement a blanket default impl for `lt`, `gt`, `le`, `ge` in terms of `cmp` whenever possible. Today, those four functions by default are only implemented in terms of `partial_cmp`.

r? @alexcrichton
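To illustrate the tangential point about `lt`, `gt`, `le`, `ge`, here is a small sketch contrasting the two routes. The function names are hypothetical; `lt_via_partial_cmp` mirrors the behavior of the default `PartialOrd::lt`, while the `*_via_cmp` functions show what a `cmp`-based default could look like for `T: Ord`.

```rust
use std::cmp::Ordering;

// Mirrors the behavior of the default `PartialOrd::lt`: the result goes
// through an `Option<Ordering>` before being reduced to a `bool`.
fn lt_via_partial_cmp<T: PartialOrd>(a: &T, b: &T) -> bool {
    matches!(a.partial_cmp(b), Some(Ordering::Less))
}

// Hypothetical `cmp`-based variants for totally ordered types: no `Option`
// is involved at any point.
fn lt_via_cmp<T: Ord>(a: &T, b: &T) -> bool {
    a.cmp(b) == Ordering::Less
}

fn le_via_cmp<T: Ord>(a: &T, b: &T) -> bool {
    a.cmp(b) != Ordering::Greater
}
```

For any `T: Ord` whose `partial_cmp` is consistent with `cmp`, both routes return the same answer; the difference is only in how much work the optimizer has to do to see through the `Option`.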